# 34 Change of basis: Part 2

Matrix representations with respect to ordered basis sets.

We now look at how to represent a linear map as a matrix with respect to different ordered basis sets.

## The standard matrix of a linear map, revisited.

Let us start with a linear map $T:\mathbb{R}^{k} \to \mathbb{R}^{n}$. From before we know we can write down its $n\times k$ **standard matrix** $[T]_{\text{std}}$, where for every vector $\vec x \in \mathbb{R}^{k}$, we have $$ T(\vec x) = [T]_{\text{std}}\vec x. \qquad(\ddagger) $$But we should be careful: what do we mean by $\text{std}$ as a subscript? It is describing the standard unit vectors $\vec e_{1},\vec e_{2},\ldots$ in the usual order, but $\vec e_{i}$ is not the same in $\mathbb{R}^{k}$ as in $\mathbb{R}^{n}$. So let us (unfortunately) adopt a slightly more cumbersome notation: denote $$ \text{std}_{k}=(\vec e_{1},\vec e_{2},\ldots,\vec e_{k}) $$to be the standard ordered basis for $\mathbb{R}^{k}$, where $\vec e_{i}$ is the $i$-th column of the identity matrix $I_{k\times k}$, and likewise $\text{std}_{n}=(\vec e_{1},\vec e_{2},\ldots,\vec e_{n})$ for $\mathbb{R}^{n}$.

Then observe that for every vector $\vec x \in \mathbb{R}^{k}$ we have $\vec x = [\vec x]_{\text{std}_{k}}$, and for every vector $\vec y \in \mathbb{R}^{n}$ we have $\vec y = [\vec y]_{\text{std}_{n}}$. So we can rewrite equation $(\ddagger)$ as $$ [T(\vec x)]_{\text{std}_{n}} = [T]_{\text{std}_{k}}^{\text{std}_{n}}[\vec x]_{\text{std}_{k}}, $$where we replace the (slightly ambiguous) notation $[T]_{\text{std}}$ with $[T]_{\text{std}_{k}}^{\text{std}_{n}}$, to indicate that this is the matrix representation of $T$ that takes inputs written in the standard basis of $\mathbb{R}^{k}$ and returns outputs written in the standard basis of $\mathbb{R}^{n}$. So,

> For a linear map $T:\mathbb{R}^{k}\to \mathbb{R}^{n}$, the **standard matrix** of $T$ is the matrix denoted $[T]_{\text{std}_{k}}^{\text{std}_{n}}$, where for every vector $\vec x\in \mathbb{R}^{k}$, we have $$ T(\vec x) = [T]_{\text{std}_{k}}^{\text{std}_{n}}\vec x, $$and, to be more explicit with the notation, $$ [T (\vec x)]_{\text{std}_{n}} = [T]_{\text{std}_{k}}^{\text{std}_{n}} [\vec x]_{\text{std}_{k}}. $$And recall we can calculate this standard matrix directly, column-wise, as $$ [T]_{\text{std}_{k}}^{\text{std}_{n}} = [\ T(\vec e_{1}) \ |\ T(\vec e_{2})\ | \ \cdots \ | \ T(\vec e_{k})\ ]. $$
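As a small computational aside (a sketch of mine, not part of the notes; the particular map `T` below is just an illustrative choice), the column-wise recipe translates directly into code:

```python
import numpy as np

# Build the standard matrix of a linear map column by column: column i is T(e_i).
def T(x):
    # illustrative linear map T(x, y) = (x + y, 2x - y, -x) from R^2 to R^3
    return np.array([x[0] + x[1], 2 * x[0] - x[1], -x[0]])

k = 2
T_std = np.column_stack([T(e) for e in np.eye(k)])   # [ T(e_1) | T(e_2) ]
print(T_std)                                         # columns (1, 2, -1) and (1, -1, 0)

x = np.array([5.0, -2.0])
assert np.allclose(T_std @ x, T(x))                  # T(x) = [T]_std x for every x
```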
## The matrix of a linear map with respect to different basis sets.

Now consider a linear map $T:\mathbb R^{k}\to\mathbb R^{n}$, and suppose we have an ordered basis $\beta=(\vec b_{1}, \vec b_{2}, \ldots, \vec b_{k})$ for $\mathbb R^{k}$ and an ordered basis $\gamma = (\vec g_{1},\vec g_{2},\ldots,\vec g_{n})$ for $\mathbb R^{n}$. If we express an input vector $\vec x$ in $\mathbb R^{k}$ in terms of its coordinate vector $[\vec x]_{\beta}$ with respect to $\beta$, and express the output vector $T(\vec x)$ in terms of its coordinate vector $[T(\vec x)]_{\gamma}$ with respect to $\gamma$, what is the relation between $[\vec x]_{\beta}$ and $[T(\vec x)]_{\gamma}$?

Diagrammatically, we want to figure out the matrix denoted with the question mark $[??]$ below: $$ \begin{array}{r|c} \begin{array}{} \underset{(\text{std}_{k})}{\mathbb R^{k}} & \xrightarrow{[T]_{\text{std}_{k}}^{\text{std}_{n}}} & \underset{(\text{std}_{n})}{\mathbb R^{n}} \\ &\quad\quad\quad & \\ \underset{(\beta)}{\mathbb R^{k}} & \xrightarrow{[??]} & \underset{(\gamma)}{\mathbb R^{n}} \end{array} & \begin{array}{} \vec x & \xrightarrow{[T]_{\text{std}_{k}}^{\text{std}_{n}}} & T(\vec x) \\ &\quad\quad\quad & \\ [\vec x]_{\beta} & \xrightarrow{[??]} & [T(\vec x)]_{\gamma} \end{array} \end{array} $$In the above diagram, the left half shows the relation between the linear spaces, while the right half shows the relation between the elements, and we want to know what $[??]$ is such that $$ [T(\vec x)]_{\gamma} = [??] [\vec x]_{\beta}. $$

Well, we know how to connect the top row with the bottom row: with the change of basis matrices $P_{\beta}$ and $P_{\gamma}$! Recall if we write $P_{\beta} = [\ \vec b_{1}\ |\ \vec b_{2}\ |\ \cdots \ |\ \vec b_{k}\ ]$ and $P_{\gamma}= [\ \vec g_{1}\ |\ \vec g_{2}\ |\ \cdots \ |\ \vec g_{n}\ ]$, then $$ \begin{array}{} \vec x = P_{\beta} [\vec x]_{\beta} & \text{for all \(\vec x \in \mathbb R^k\)}, \\ \vec y = P_{\gamma} [\vec y]_{\gamma} & \text{for all \(\vec y \in \mathbb R^n\)}. \end{array} $$And using the fact that $P_{\beta}$ and $P_{\gamma}$ are both invertible, we can fill in the arrows to join the top row with the bottom row in the above diagram: $$ \begin{array}{r|c} \begin{array}{} \underset{(\text{std}_{k})}{\mathbb R^{k}} & \xrightarrow{[T]_{\text{std}_{k}}^{\text{std}_{n}}} & \underset{(\text{std}_{n})}{\mathbb R^{n}} \\ \color{blue} P_{\beta}\uparrow\ \ \ \ \ &\quad ⟲ \quad & \ \ \ \ \ \ \color{blue} \downarrow P_{\gamma}^{-1}\\ \underset{(\beta)}{\mathbb R^{k}} & \xrightarrow{[??]} & \underset{(\gamma)}{\mathbb R^{n}} \end{array} & \begin{array}{} \vec x & \xrightarrow{[T]_{\text{std}_{k}}^{\text{std}_{n}}} & T(\vec x) \\ \color{blue} P_{\beta}\uparrow\ \ \ \ \ &\quad ⟲\quad & \ \ \ \ \ \ \color{blue} \downarrow P_{\gamma}^{-1}\\ [\vec x]_{\beta} & \xrightarrow{[??]} & [T(\vec x)]_{\gamma} \end{array} \end{array} $$So by commutativity of the function arrows, we see that $[??]$ is given by $$ [??] = P_{\gamma}^{-1}[T]_{\text{std}_{k}}^{\text{std}_{n}}P_{\beta}\ , $$and since the job of $[??]$ is to convert $[\vec x]_{\beta}$ to $[T(\vec x)]_{\gamma}$, it is really the map $T$ itself, except it takes input vectors as coordinate vectors w.r.t. $\beta$, and outputs answers as coordinate vectors w.r.t. $\gamma$. So we write $[??]$ as $[T]_{\beta}^{\gamma}$, **the matrix representation of $T$, from basis $\beta$ to $\gamma$**. So our diagram now looks like $$ \begin{array}{r|c} \begin{array}{} \underset{(\text{std}_{k})}{\mathbb R^{k}} & \xrightarrow{[T]_{\text{std}_{k}}^{\text{std}_{n}}} & \underset{(\text{std}_{n})}{\mathbb R^{n}} \\ \color{blue} P_{\beta}\uparrow\ \ \ \ \ &\quad ⟲ \quad & \ \ \ \ \ \ \color{blue} \downarrow P_{\gamma}^{-1}\\ \underset{(\beta)}{\mathbb R^{k}} & \xrightarrow{[T]_{\beta}^{\gamma}} & \underset{(\gamma)}{\mathbb R^{n}} \end{array} & \begin{array}{} \vec x & \xrightarrow{[T]_{\text{std}_{k}}^{\text{std}_{n}}} & T(\vec x) \\ \color{blue} P_{\beta}\uparrow\ \ \ \ \ &\quad ⟲\quad & \ \ \ \ \ \ \color{blue} \downarrow P_{\gamma}^{-1}\\ [\vec x]_{\beta} & \xrightarrow{[T]_{\beta}^{\gamma}} & [T(\vec x)]_{\gamma} \end{array} \end{array} $$

In summary,

> Let $T:\mathbb R^{k}\to \mathbb R^{n}$ be a linear map.
> Let $\beta=(\vec b_{1},\ldots,\vec b_{k})$ be some ordered basis of $\mathbb R^{k}$.
> Let $\gamma=(\vec g_{1},\ldots,\vec g_{n})$ be some ordered basis of $\mathbb R^{n}$.
> Then $[T]_{\beta}^{\gamma}$ is the matrix such that for all $\vec x \in \mathbb R^{k}$, we have $$ [T]_{\beta}^{\gamma} [\vec x]_{\beta} = [T(\vec x)]_{\gamma}. $$It is related to the standard matrix $[T]_{\text{std}_{k}}^{\text{std}_{n}}$ by $$ [T]_{\beta}^{\gamma} = P_{\gamma}^{-1}[T]_{\text{std}_{k}}^{\text{std}_{n}}P_{\beta} \ , $$where $P_{\beta} = [\vec b_{1}\ |\ \cdots \ |\ \vec b_{k}]$ and $P_{\gamma} = [\vec g_{1}\ |\ \cdots \ |\ \vec g_{n}]$ are the associated change of basis matrices.
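Here is a quick numerical sanity check of this summary (my own sketch; the map and the bases are random stand-ins, not anything from the notes):

```python
import numpy as np

# Sketch: for a random linear map T: R^2 -> R^3 and random ordered bases beta, gamma,
# verify that P_gamma^{-1} [T]_std P_beta sends [x]_beta to [T(x)]_gamma.
rng = np.random.default_rng(0)
k, n = 2, 3
T_std = rng.standard_normal((n, k))       # [T]_{std_k}^{std_n}
P_beta = rng.standard_normal((k, k))      # columns form an ordered basis beta of R^k
P_gamma = rng.standard_normal((n, n))     # columns form an ordered basis gamma of R^n

T_bg = np.linalg.inv(P_gamma) @ T_std @ P_beta   # [T]_beta^gamma

x = rng.standard_normal(k)
x_beta = np.linalg.solve(P_beta, x)              # [x]_beta, since x = P_beta [x]_beta
Tx_gamma = np.linalg.solve(P_gamma, T_std @ x)   # [T(x)]_gamma, since T(x) = P_gamma [T(x)]_gamma
assert np.allclose(T_bg @ x_beta, Tx_gamma)      # [T]_beta^gamma [x]_beta = [T(x)]_gamma
```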
There is an alternate method to compute $[T]_{\beta}^{\gamma}$; we claim that

> The matrix representation $[T]_{\beta}^{\gamma}$ can be determined column-wise: $$ [T]_{\beta}^{\gamma}=\left[\ [T(\vec b_{1})]_{\gamma} \ \ \ [T(\vec b_{2})]_{\gamma}\ \ \ [T(\vec b_{3})]_{\gamma}\ \ \cdots\ \ \ [T(\vec b_{k})]_{\gamma}\ \right], $$where $\beta=(\vec b_{1},\ldots,\vec b_{k})$ is some ordered basis of $\mathbb R^{k}$, and $\gamma=(\vec g_{1},\ldots,\vec g_{n})$ is some ordered basis of $\mathbb R^{n}$.

To understand why this column-wise formula holds, note that the $i$-th column of $[T]_{\beta}^{\gamma}$ can be obtained by computing $[T]_{\beta}^{\gamma}\ \vec e_{i}$. This means the $i$-th column of $[T]_{\beta}^{\gamma}$ is $$ \begin{align*} [T]_{\beta}^{\gamma}\vec e_{i} & = [T]_{\beta}^{\gamma}[\vec b_{i}]_{\beta} \\ & =[T(\vec b_{i})]_{\gamma}. \end{align*} $$Here we use the fact that the coordinate vector of $\vec b_{i}$ with respect to $\beta$ is just $\vec e_{i}$, that is, $\vec e_{i} = [\vec b_{i}]_{\beta}$, together with the fact that what $[T]_{\beta}^{\gamma}$ does is convert $[\vec x]_{\beta}$ into $[T(\vec x)]_{\gamma}$.

As a further computational note, recall we can find the coordinate vector $[\vec y]_{\gamma}$ by performing the row reduction $$ [P_{\gamma}\ |\ \vec y]=[\vec g_{1} \ \vec g_{2}\ \cdots \ \vec g_{n}\ | \ \vec y\ ] \stackrel{\text{row}}\sim [\ I_{n} \ |\ [\vec y]_{\gamma}\ ], $$where $\gamma = (\vec g_{1}, \vec g_{2},\ldots,\vec g_{n})$. So we can compute all the columns of $[T]_{\beta}^{\gamma}$ by doing all of these row reductions at once:

> Computationally, by row reduction:
> If $T:\mathbb R^{k}\to \mathbb R^{n}$ is a linear map, $\beta=(\vec b_{1},\ldots,\vec b_{k})$ is some ordered basis of $\mathbb R^{k}$, and $\gamma=(\vec g_{1},\ldots,\vec g_{n})$ is some ordered basis of $\mathbb R^{n}$, then $[T]_{\beta}^{\gamma}$ can be found by the row reduction $$ [\ P_{\gamma}\ | \ T(\vec b_{1}) \ \ T(\vec b_{2})\ \cdots \ T(\vec b_{k})\ ] \stackrel{\text{row}}\sim [\ I_{n} \ |\ [T]_{\beta}^{\gamma}\ ], $$where $P_\gamma = [\vec g_{1} \ \vec g_{2}\ \cdots \ \vec g_{n}\ ]$.
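The column-wise recipe is also easy to check numerically (again a sketch of mine; solving $P_{\gamma}\vec c = T(\vec b_{i})$ with a linear solver plays the role of the row reduction):

```python
import numpy as np

# Sketch: column i of [T]_beta^gamma is [T(b_i)]_gamma, i.e. the solution c of
# P_gamma c = T(b_i). Solving these systems is exactly what the row reduction does.
rng = np.random.default_rng(1)
k, n = 2, 3
T_std = rng.standard_normal((n, k))       # some standard matrix [T]_{std_k}^{std_n}
P_beta = rng.standard_normal((k, k))      # columns b_1, ..., b_k (invertible, so a basis)
P_gamma = rng.standard_normal((n, n))     # columns g_1, ..., g_n

cols = [np.linalg.solve(P_gamma, T_std @ b) for b in P_beta.T]  # each is [T(b_i)]_gamma
T_bg = np.column_stack(cols)

# Agrees with the change-of-basis formula from the previous section.
assert np.allclose(T_bg, np.linalg.inv(P_gamma) @ T_std @ P_beta)
```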
**Example.** Let $T:\mathbb R^{2} \to \mathbb R^{3}$ be a linear map, where $$ T\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}x+y\\2x-y\\-x\end{pmatrix}. $$Take $\beta = (\begin{pmatrix}2\\1\end{pmatrix},\begin{pmatrix}3\\1\end{pmatrix})$ an ordered basis for $\mathbb R^{2}$, and $\gamma = (\begin{pmatrix}1\\1\\1\end{pmatrix},\begin{pmatrix}1\\1\\0\end{pmatrix},\begin{pmatrix}1\\0\\0\end{pmatrix})$ an ordered basis for $\mathbb R^{3}$. (A) Find $[T]_{\text{std}_{2}}^{\text{std}_{3}}$. (B) Find $[T]_{\beta}^{\gamma}$. (C) Take $\vec v = \begin{pmatrix}1\\1\end{pmatrix}$, calculate $T(\vec v)$. (D) With the same $\vec v$, calculate $[\vec v]_{\beta}$ and $[T(\vec v)]_{\gamma}$. (E) Verify that $[T]_{\beta}^{\gamma}[\vec v]_{\beta}=[T(\vec v)]_{\gamma}$.

$\blacktriangleright$ (A) To find $[T]_{\text{std}_{2}}^{\text{std}_{3}}$, which is the standard matrix of $T$, we can just find it column by column: $$ [T]_{\text{std}_{2}}^{\text{std}_{3}} = [T(\vec e_{1})\ |\ T(\vec e_{2})] = \begin{pmatrix}1&1 \\ 2&-1\\-1&0\end{pmatrix}. $$

(B) To find $[T]_{\beta}^{\gamma}$ there are two ways of doing it. The first way uses the change of basis matrices: $$ [T]_{\beta}^{\gamma} = P_{\gamma}^{-1}[T]_{\text{std}_{2}}^{\text{std}_{3}} P_{\beta}. $$Since we have $$ P_{\beta} = \begin{pmatrix}2&3\\1&1\end{pmatrix}, \quad P_{\gamma} = \begin{pmatrix}1&1&1\\1&1&0\\1&0&0\end{pmatrix}, \quad P_{\gamma}^{-1} = \begin{pmatrix}0 & 0 & 1 \\ 0 & 1 & -1\\1 & -1 & 0 \end{pmatrix}, $$we get $$ [T]_{\beta}^{\gamma} = \begin{pmatrix}0 & 0 & 1 \\ 0 & 1 & -1\\1 & -1 & 0 \end{pmatrix} \begin{pmatrix}1&1 \\ 2&-1\\-1&0\end{pmatrix}\begin{pmatrix}2&3\\1&1\end{pmatrix} = \begin{pmatrix}-2 & -3\\5 & 8\\0 & -1\end{pmatrix}. $$(You should verify each calculation that is omitted above.)

The second way is to do it column-wise, which can be carried out via the row reduction $$ [P_{\gamma} \ |\ T(\vec b_{1})\ T(\vec b_{2})] \stackrel{\text{row}}\sim [I_{3}\ |\ [T]_{\beta}^{\gamma}\ ]. $$Since we have $$ T(\vec b_{1}) = \begin{pmatrix}3\\3\\-2\end{pmatrix}, \quad T(\vec b_{2})=\begin{pmatrix}4\\5\\-3\end{pmatrix}, $$the row reduction is $$ \left[\begin{array}{ccc|cc} 1 & 1 & 1 & 3 & 4 \\ 1 & 1 & 0 & 3 & 5 \\ 1 & 0 & 0 & -2 & -3 \end{array}\right] \stackrel{\text{row}}\sim \left[\begin{array}{ccc|cc} 1 & 0 & 0 & -2 & -3 \\ 0 & 1 & 0 & 5 & 8 \\ 0 & 0 & 1 & 0 & -1 \end{array}\right], $$so $[T]_{\beta}^{\gamma} = \begin{pmatrix}-2 & -3\\5 & 8\\0 & -1\end{pmatrix}$.

(C) If $\vec v = \begin{pmatrix}1\\1\end{pmatrix}$, then $T(\vec v)=\begin{pmatrix}2\\1\\-1\end{pmatrix}$.

(D) We have $[\vec v]_{\beta} = \begin{pmatrix}2\\-1\end{pmatrix}$. This can be found by row reducing $$ [P_{\beta}\ |\ \vec v] \stackrel{\text{row}}\sim [I_{2}\ |\ [\vec v]_{\beta}\ ]. $$Also we have $[T(\vec v)]_{\gamma} = \begin{pmatrix}-1\\2\\1\end{pmatrix}$, which can be found by row reducing $$ [P_{\gamma}\mid T(\vec v)]\stackrel{\text{row}}\sim[I_{3}\mid[T(\vec v)]_{\gamma}]. $$

(E) Now this demonstrates the purpose of $[T]_{\beta}^{\gamma}$: if we multiply $[T]_{\beta}^{\gamma}$ with $[\vec v]_{\beta}$, we get $$ [T]_{\beta}^{\gamma}[\vec v]_{\beta} = \begin{pmatrix}-2 & -3\\5 & 8\\0 & -1\end{pmatrix}\begin{pmatrix}2\\-1\end{pmatrix}= \begin{pmatrix}-1 \\2\\1\end{pmatrix}, $$which indeed is $[T(\vec v)]_{\gamma}$! $\blacksquare$
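For readers who like to double-check with a computer, here is a small NumPy verification of parts (A) through (E) (a sketch, not part of the original solution):

```python
import numpy as np

# Numerical check of the worked example.
T_std = np.array([[1, 1], [2, -1], [-1, 0]])              # (A) [T]_{std_2}^{std_3}
P_beta = np.array([[2, 3], [1, 1]])                        # columns are b_1, b_2
P_gamma = np.array([[1, 1, 1], [1, 1, 0], [1, 0, 0]])      # columns are g_1, g_2, g_3

T_bg = np.linalg.inv(P_gamma) @ T_std @ P_beta             # (B)
assert np.allclose(T_bg, [[-2, -3], [5, 8], [0, -1]])

v = np.array([1, 1])
Tv = T_std @ v                                             # (C): (2, 1, -1)
v_beta = np.linalg.solve(P_beta, v)                        # (D): (2, -1)
Tv_gamma = np.linalg.solve(P_gamma, Tv)                    # (D): (-1, 2, 1)
assert np.allclose(T_bg @ v_beta, Tv_gamma)                # (E)
```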
## Maps whose matrix representations are square, and similar matrices.

We now turn to the special case where the domain and codomain of our linear map are both $\mathbb R^{n}$. In this case the matrix representation is a square matrix, and we can choose the same ordered basis $\beta$ for both the domain and the codomain. So, for a linear map $T: \mathbb R^{n} \to \mathbb R^{n}$ and an ordered basis $\beta$ of $\mathbb R^{n}$, we have the diagram $$ \begin{array}{} \underset{(\text{std})} {\mathbb R^{n}} & \xrightarrow{[T]_{\text{std}}^{\text{std}}} & \underset{(\text{std})} {\mathbb R^{n}} \\ \left._{P_{\beta}}\big\uparrow \right. \ \ \ & \quad ⟲ \quad & \ \ \ \ {\big\downarrow^{P_{\beta}^{-1}}} \\ \underset{(\beta)} {\mathbb R^{n}} & \xrightarrow{[T]_{\beta}^{\beta}} & \underset{(\beta)} {\mathbb R^{n}} \end{array} $$from which we see $$ [T]_{\beta}^{\beta} = P_{\beta}^{-1} [T]_{\text{std}}^{\text{std}} P_{\beta} $$and $$ [T]_{\text{std}}^{\text{std}}=P_{\beta}[T]_{\beta}^{\beta}P_{\beta}^{-1}. $$

When we have this pattern of two matrices $A$ and $B$ related by an equation $$ A = PBP^{-1} $$for some invertible matrix $P$, we say that $A$ is **similar** to $B$, and we write $A \stackrel{\text{sim}}\sim B$. This choice of vocabulary is descriptive and deliberate. In our case we see that $[T]_{\text{std}}^{\text{std}}=P_{\beta}[T]_{\beta}^{\beta}P_{\beta}^{-1}$, so we would say $[T]_{\text{std}}^{\text{std}}$ is similar to $[T]_{\beta}^{\beta}$, and write $[T]_{\text{std}}^{\text{std}} \stackrel{\text{sim}}\sim [T]_{\beta}^{\beta}$. And because $[T]_{\beta}^{\beta}$ is a matrix representation of the same linear map $T$, whose standard matrix representation is $[T]_{\text{std}}^{\text{std}}$, we know they are matrices that are "doing similar things, just over different bases", whence similar! In summary,

> When two square matrices $A$ and $B$ satisfy the relation $$ A=PBP^{-1} $$for some invertible matrix $P$, then we say $A$ is **similar** to $B$, and write $A\stackrel{\text{sim}}\sim B$.
> When this happens, $A$ and $B$ are just matrix representations of the same linear map, represented over different basis sets.
>
> Furthermore, if we think of $A=[T]_{\text{std}}^{\text{std}}$ as the standard matrix of some linear map $T$, and $B=[T]_{\beta}^{\beta}$ as the matrix representation of $T$ with respect to some ordered basis $\beta$, then the matrix $P$ such that $A=PBP^{-1}$ has as its columns precisely the vectors of the ordered basis $\beta$, with diagram $$ \begin{array}{rcl} \mathbb R^{n} & \xrightarrow{A} & \mathbb R^{n} \\ P^{-1}\big\downarrow &\quad ⟲ \quad & \big\uparrow P \\ \mathbb R^{n} & \xrightarrow{B} & \mathbb R^{n} \end{array} $$(Careful with the arrow directions.)

This is a profound insight, because it shows that even though there are a lot of square matrices, we can group them into **similarity classes**: those that are similar to one another. We will discuss more on similarity, with examples, in the next notes [[smc-spring-2024-math-13/linear-algebra-notes/35-similarity|notes 35 similarity]].

By the way, the little circle arrow ⟲ is just decoration to emphasize that the diagram commutes; the orientation is irrelevant. Some people draw $\circ$, or put hash marks $/ / /$, or nothing at all. Putting nothing is most popular, since often, in the right context, the diagram is understood to commute! (And we would emphasize it if it did not.) It is just style and does nothing but decorate.
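As a final numerical illustration (a sketch with an arbitrarily chosen matrix and basis, not taken from the notes), the two representations of a map $T:\mathbb R^{2}\to\mathbb R^{2}$ are similar in exactly this sense:

```python
import numpy as np

# Sketch: for a square map T: R^2 -> R^2 and a basis beta, the two representations
# are similar: [T]_std = P_beta [T]_beta^beta P_beta^{-1}.
A = np.array([[1.0, 2.0], [0.0, 3.0]])          # [T]_std^std for some illustrative T
P_beta = np.array([[2.0, 3.0], [1.0, 1.0]])     # columns are the basis beta

B = np.linalg.inv(P_beta) @ A @ P_beta          # [T]_beta^beta
assert np.allclose(A, P_beta @ B @ np.linalg.inv(P_beta))   # A is similar to B
```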